[rocsolver] Use enqueue_native_command ext when avail #582
+67
−52
This PR uses the DPC++ enqueue_native_command extension when it is available. Using the extension improves performance and integrates correctly with the DPC++ scheduler.
The implementation is very similar to the cusolver part of #572; see #572 for further details of this extension.
I also made a small change to the standard host_task implementation to add a missing sync in the batch functions. Because of that change, I attach tests (marked "*host_task") for the case where the extension macro for enqueue_native_command is not defined, in addition to the tests for the extension path.
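For readers unfamiliar with the pattern, here is a minimal sketch of the compile-time selection between the two paths. It is not the code in this PR: MockHandler, HYPOTHETICAL_HAS_ENQUEUE_NATIVE_COMMAND, submit_native and run_batch_sketch are all hypothetical names, and the handler is a plain C++ stand-in so the sketch builds without a SYCL toolchain. It also assumes that the explicit sync is only needed on the host_task fallback, which is my reading of the description above rather than something the sketch proves.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a command-group handler. It only records which
// submission path was taken and then runs the callable immediately.
struct MockHandler {
    std::vector<std::string> log;

    void enqueue_native_command(const std::function<void()>& f) {
        log.push_back("native_command");
        f();
    }

    void host_task(const std::function<void()>& f) {
        log.push_back("host_task");
        f();
    }
};

// Submit native library work through whichever path is available.
// HYPOTHETICAL_HAS_ENQUEUE_NATIVE_COMMAND is a placeholder for the real
// extension macro, which is not named in this description.
template <typename Work>
void submit_native(MockHandler& cgh, Work work) {
#ifdef HYPOTHETICAL_HAS_ENQUEUE_NATIVE_COMMAND
    // Extension path: assumed to leave completion tracking to the scheduler,
    // so no explicit sync is issued here.
    cgh.enqueue_native_command([&] { work(); });
#else
    // Fallback path: the task is considered finished when the callable
    // returns, so the sketch records an explicit sync after the work.
    cgh.host_task([&] {
        work();
        cgh.log.push_back("stream_sync");
    });
#endif
}

// Run one placeholder batch call and return the recorded sequence of events.
std::vector<std::string> run_batch_sketch() {
    MockHandler cgh;
    submit_native(cgh, [&] { cgh.log.push_back("batch_call"); });
    return cgh.log;
}
```

Built without the placeholder macro, run_batch_sketch() records the host_task path followed by the batch call and the sync; built with -DHYPOTHETICAL_HAS_ENQUEUE_NATIVE_COMMAND, it records the native-command path and no sync. This mirrors why both sets of test logs are attached.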
tests:
test_main_lapack_rt_native_command_amd.txt
test_main_lapack_ct_native_command_amd.txt
test_main_lapack_ct_host_task_amd.txt
test_main_lapack_rt_host_task_amd.txt